Use list instead of regex for string split #509

Closed

ypconstante
Contributor

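Currently, the class and attribute selectors call String.split with a regex. Splitting on a list of separators instead is faster: in the benchmark below, this PR runs about 1.26x faster than the current code and uses slightly less memory (1.92 MB vs. 2.09 MB).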
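
To illustrate the kind of change described above, here is a minimal sketch comparing the two forms of String.split. The input string and the separator list (space, tab, newline) are assumptions chosen for illustration; they are not taken from the Floki source, whose actual patterns may differ.

```elixir
# Hypothetical input: a class attribute value with mixed whitespace.
class_attr = "mw-parser-output  content\tmain\nfooter"

# Regex-based split: runs of whitespace are collapsed by the `+` quantifier.
by_regex = String.split(class_attr, ~r/\s+/, trim: true)

# List-based split: each separator is a single character, so a run of
# separators yields empty strings, which `trim: true` removes.
# The separator list here is an assumption, not Floki's actual list.
by_list = String.split(class_attr, [" ", "\t", "\n"], trim: true)

IO.inspect(by_regex, label: "regex")
IO.inspect(by_list, label: "list")
```

For this input both calls return the same four class names. Note that `\s` in a regex matches more whitespace characters than the three listed here, so a list-based replacement has to enumerate every separator the selector is expected to handle.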

Name                    ips        average  deviation         median         99th %
pr                   668.99        1.49 ms    ±15.56%        1.43 ms        2.39 ms
today                528.96        1.89 ms    ±13.51%        1.81 ms        2.84 ms

Comparison: 
pr                   668.99
today                528.96 - 1.26x slower +0.40 ms

Memory usage statistics:

Name             Memory usage
pr                    1.92 MB
today                 2.09 MB - 1.08x memory usage +0.161 MB

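# Read a file located in the same directory as this script.
read_file = fn name ->
  __ENV__.file
  |> Path.dirname()
  |> Path.join(name)
  |> File.read!()
end

html_input = read_file.("small.html")

# Parse once, outside the benchmark, so that only the selector lookup is measured.
[{"html", _, _} = html | _] = Floki.parse_document!(html_input)

Benchee.run(
  %{
    "bench" => fn -> Floki.Finder.find(html, ".mw-parser-output > p") end
  },
  # Seconds spent measuring run time and memory usage, respectively.
  time: 5,
  memory_time: 2
)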

@ypconstante
Contributor Author

Replaced by #510

@ypconstante deleted the optimize-string-split branch on December 28, 2023 at 01:52